Fuzzy Syntactic Reordering for Phrase-based Statistical Machine Translation
Abstract
The quality of Arabic-English statistical machine translation often suffers as a result of standard phrase-based SMT systems’ inability to perform long-range re-orderings, specifically those needed to translate VSO-ordered Arabic sentences. This problem is further exacerbated by the low performance of Arabic parsers on subject and subject span detection. In this paper, we present two parse “fuzzification” techniques which allow the translation system to select among a range of possible S–V re-orderings. With this approach, we demonstrate a 0.3-point improvement in BLEU score (69% of the maximum possible using gold parses), and a corresponding improvement in the percentage of syntactically well-formed subjects under a manual evaluation.
Similar resources
Tree Kernel-based SVM with Structured Syntactic Knowledge for BTG-based Phrase Reordering
Structured syntactic knowledge is important for phrase reordering. This paper proposes using convolution tree kernel over source parse tree to model structured syntactic knowledge for BTG-based phrase reordering in the context of statistical machine translation. Our study reveals that the structured syntactic features over the source phrases are very effective for BTG constraint-based phrase re...
Phrase Reordering Model Integrating Syntactic Knowledge for SMT
Reordering model is important for the statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering. This paper introduces syntactic knowledge to improve global reordering capability of SMT system. Syntactic knowledge such as boundary words, POS information and dependencies is used to guide phrase reordering. Not on...
Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation
Syntactic Reordering of the source language to better match the phrase structure of the target language has been shown to improve the performance of phrase-based Statistical Machine Translation. This paper applies syntactic reordering to English-to-Arabic translation. It introduces reordering rules, and motivates them linguistically. It also studies the effect of combining reordering with Arabi...
A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation
This paper explores a simple and effective unified framework for incorporating soft linguistic reordering constraints into a hierarchical phrase-based translation system: 1) a syntactic reordering model that explores reorderings for context free grammar rules; and 2) a semantic reordering model that focuses on the reordering of predicate-argument structures. We develop novel features based on b...
Practical Approach to Syntax-based Statistical Machine Translation
This paper presents a practical approach to statistical machine translation (SMT) based on syntactic transfer. Conventionally, phrase-based SMT generates an output sentence by combining phrase (multiword sequence) translation and phrase reordering without syntax. On the other hand, SMT based on tree-to-tree mapping, which involves syntactic information, is theoretical, so its features remain un...
A Reordering Approach for Statistical Machine Translation
This paper presents a Markov based hierarchical reordering scheme for lexical reordering to incorporate into phrase-based statistical machine translation system. The goal is to reorder the words and phrases in source language syntactic structure into their corresponding target language syntactic order for making translation easy. Without reordering during language translation, sentences can onl...